AITopics | classification module

Multimodal Learning and Reasoning for Visual Question Answering

Ilija Ilievski, Jiashi Feng

Neural Information Processing Systems | Nov-21-2025, 13:52:53 GMT

Typically, a VQA model is comprised of two modules for learning the question and the image representations, and a third module for fusing the representations into a single multimodal representation.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

EEGDM: EEG Representation Learning via Generative Diffusion Model

Puah, Jia Hong, Goh, Sim Kuan, Zhang, Ziwei, Ye, Zixuan, Chan, Chow Khuen, Lim, Kheng Seang, Fong, Si Lei, Woon, Kok Sin, Guan, Cuntai

arXiv.org Artificial Intelligence | Sep-3-2025

While the electroencephalogram (EEG) has been a crucial tool for monitoring the brain and diagnosing neurological disorders (e.g., epilepsy), learning meaningful representations from raw EEG signals remains challenging due to limited annotations and high signal variability. Recently, EEG foundation models (FMs) have shown promising potential by adopting transformer architectures and self-supervised pre-training methods from large language models (e.g., masked prediction) to learn representations from diverse EEG data, followed by fine-tuning on specific EEG tasks. Nonetheless, these large models often incurred high computational costs during both training and inference, with only marginal performance improvements as the model size increased. In this work, we proposed an EEG representation learning framework built upon a generative diffusion model (EEGDM). Specifically, we developed a structured state-space model for diffusion pretraining (SSMDP) to better capture the temporal dynamics of EEG signals and trained it using the Denoising Diffusion Probabilistic Model (DDPM) framework. The resulting latent EEG representations were then used for downstream classification tasks via our proposed latent fusion transformer (LFT). To evaluate our method, we used multi-event datasets covering both interictal epileptiform discharges (TUEV) and seizure (CHB-MIT) detection, and compared EEGDM with current state-of-the-art approaches, including EEG FMs. Empirical results showed that our method outperformed the existing methods. These findings suggested that EEGDM offered a promising alternative to current FMs. Our source code and checkpoint are available at: https://github.com/jhpuah/EEGDM.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2508.14086

Country: Asia > China (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Appendix - Hard-Attention for Scalable Image Classification

Neural Information Processing Systems | Aug-15-2025, 08:55:23 GMT

Coverage corresponds to the percentage of the image area that is covered by attended locations.
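As a rough illustration of the coverage definition above, the following sketch computes the percentage of image pixels that fall inside at least one attended location. The box format `(top, left, bottom, right)` and the function name `coverage` are assumptions made for this example; they are not taken from the paper.

```python
def coverage(height, width, boxes):
    """Percentage of the image area covered by at least one attended box.

    Each box is (top, left, bottom, right) in pixels, with bottom/right
    exclusive. This box format is a hypothetical choice for the sketch.
    """
    covered = [[False] * width for _ in range(height)]
    for top, left, bottom, right in boxes:
        # Clip each box to the image bounds before marking pixels.
        for r in range(max(0, top), min(height, bottom)):
            for c in range(max(0, left), min(width, right)):
                covered[r][c] = True
    attended = sum(cell for row in covered for cell in row)
    return 100.0 * attended / (height * width)


# Two overlapping 2x2 boxes on a 4x4 image cover 7 of 16 pixels.
print(coverage(4, 4, [(0, 0, 2, 2), (1, 1, 3, 3)]))  # 43.75
```

Overlapping locations are counted once, so coverage never exceeds 100%.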

feature extraction module, module, vector, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.64)

BERT-based model for Vietnamese Fact Verification Dataset

Tran, Bao, Khanh, T. N., Tuong, Khang Nguyen, Dang, Thien, Nguyen, Quang, Thinh, Nguyen T., Hung, Vo T.

arXiv.org Artificial Intelligence | Mar-1-2025

The rapid advancement of information and communication technology has facilitated easier access to information. However, this progress has also necessitated more stringent verification measures to ensure the accuracy of information, particularly within the context of Vietnam. This paper introduces an approach to address the challenges of Fact Verification using the Vietnamese dataset by integrating both sentence selection and classification modules into a unified network architecture. The proposed approach leverages the power of large language models by utilizing pre-trained PhoBERT and XLM-RoBERTa as the backbone of the network. The proposed model was trained on a Vietnamese dataset, named ISE-DSC01, and demonstrated superior performance compared to the baseline model across all three metrics. Notably, we achieved a Strict Accuracy level of 75.11%, indicating a remarkable 28.83% improvement over the baseline model.

dataset, fact verification, verification, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-74127-2_19

2503.00356

Country:

Asia > Vietnam > Hanoi > Hanoi (0.14)
Asia > Vietnam > Bắc Ninh Province > Bắc Ninh (0.05)
Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.05)
(4 more...)

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

CellViT++: Energy-Efficient and Adaptive Cell Segmentation and Classification Using Foundation Models

Hörst, Fabian, Rempe, Moritz, Becker, Helmut, Heine, Lukas, Keyl, Julius, Kleesiek, Jens

arXiv.org Artificial Intelligence | Jan-9-2025

Digital Pathology is a cornerstone in the diagnosis and treatment of diseases. A key task in this field is the identification and segmentation of cells in hematoxylin and eosin-stained images. Existing methods for cell segmentation often require extensive annotated datasets for training and are limited to a predefined cell classification scheme. To overcome these limitations, we propose CellViT++, a framework for generalized cell segmentation in digital pathology. CellViT++ utilizes Vision Transformers with foundation models as encoders to compute deep cell features and segmentation masks simultaneously. To adapt to unseen cell types, we rely on a computationally efficient approach. It requires minimal data for training and leads to a drastically reduced carbon footprint. We demonstrate excellent performance on seven different datasets, covering a broad spectrum of cell types, organs, and clinical settings. The framework achieves remarkable zero-shot segmentation and data-efficient cell-type classification. Furthermore, we show that CellViT++ can leverage immunofluorescence stainings to generate training datasets without the need for pathologist annotations. The automated dataset generation approach surpasses the performance of networks trained on manually labeled data, demonstrating its effectiveness in creating high-quality training datasets without expert annotations. To advance digital pathology, CellViT++ is available as an open-source framework featuring a user-friendly, web-based interface for visualization and annotation. The code is available at https://github.com/TIO-IKIM/CellViT-plus-plus.

dataset, sd 0, segmentation, (15 more...)

arXiv.org Artificial Intelligence

2501.05269

Country:

Europe > United Kingdom > England > Warwickshire (0.04)
Europe > Switzerland (0.04)
North America > United States > New York (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.93)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)
(3 more...)

Motor Imagery Classification for Asynchronous EEG-Based Brain-Computer Interfaces

Wu, Huanyu, Li, Siyang, Wu, Dongrui

arXiv.org Artificial Intelligence | Dec-12-2024

Motor imagery (MI) based brain-computer interfaces (BCIs) enable the direct control of external devices through the imagined movements of various body parts. Unlike previous systems that used fixed-length EEG trials for MI decoding, asynchronous BCIs aim to detect the user's MI without explicit triggers. They are challenging to implement, because the algorithm needs to first distinguish between resting-states and MI trials, and then classify the MI trials into the correct task, all without any triggers. This paper proposes a sliding window prescreening and classification (SWPC) approach for MI-based asynchronous BCIs, which consists of two modules: a prescreening module to screen MI trials out of the resting-state, and a classification module for MI classification. Both modules are trained with supervised learning followed by self-supervised learning, which refines the feature extractors. Within-subject and cross-subject asynchronous MI classifications on four different EEG datasets validated the effectiveness of SWPC, i.e., it always achieved the highest average classification accuracy, and outperformed the best state-of-the-art baseline on each dataset by about 2%.
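The two-stage idea described in this abstract (prescreen each sliding window, then classify only the windows judged to contain MI) can be sketched roughly as follows. This is a minimal illustration, not the authors' SWPC implementation: the function names, the toy `is_mi` and `classify` rules, and the one-dimensional signal are all assumptions standing in for trained modules and multi-channel EEG.

```python
def sliding_windows(signal, window, step):
    """Yield (start_index, segment) pairs over a 1-D sequence."""
    for start in range(0, len(signal) - window + 1, step):
        yield start, signal[start:start + window]


def detect_and_classify(signal, window, step, is_mi, classify):
    """Prescreen every window; classify only those flagged as MI.

    `is_mi` and `classify` are hypothetical callables standing in for the
    prescreening and classification modules.
    """
    results = []
    for start, segment in sliding_windows(signal, window, step):
        if is_mi(segment):  # stage 1: reject resting-state windows
            results.append((start, classify(segment)))  # stage 2
    return results


# Toy stand-ins: high mean amplitude counts as MI, and the sign picks the class.
def toy_is_mi(seg):
    return sum(abs(x) for x in seg) / len(seg) > 1.0


def toy_classify(seg):
    return "left" if sum(seg) > 0 else "right"


signal = [0, 0, 0, 5, 5, 5, 0, 0, 0, -5, -5, -5]
print(detect_and_classify(signal, 3, 3, toy_is_mi, toy_classify))
# [(3, 'left'), (9, 'right')]
```

Keeping the prescreening step separate means the classifier never has to model the resting state, which is the division of labor the abstract describes.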

classification, module, swpc, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TNSRE.2024.3356916

2412.09006

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

E2E-AFG: An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation

Jiang, Yun, Xie, Zilong, Zhang, Wei, Fang, Yun, Pan, Shuai

arXiv.org Artificial Intelligence | Nov-1-2024

Retrieval-augmented generation methods often neglect the quality of content retrieved from external knowledge bases, resulting in irrelevant information or potential misinformation that negatively affects the generation results of large language models. In this paper, we propose an end-to-end model with adaptive filtering for retrieval-augmented generation (E2E-AFG), which integrates answer existence judgment and text generation into a single end-to-end framework. This enables the model to focus more effectively on relevant content while reducing the influence of irrelevant information and generating accurate answers. We evaluate E2E-AFG on six representative knowledge-intensive language datasets, and the results show that it consistently outperforms baseline models across all tasks, demonstrating the effectiveness and robustness of the proposed approach.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2411.00437

Country: Asia > China (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Media > News (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multimodal Learning and Reasoning for Visual Question Answering

Ilija Ilievski, Jiashi Feng

Neural Information Processing Systems | Oct-4-2024, 10:47:18 GMT

Reasoning about entities and their relationships from multimodal data is a key goal of Artificial General Intelligence. The visual question answering (VQA) problem is an excellent way to test such reasoning capabilities of an AI model and its multimodal representation learning. However, the current VQA models are oversimplified deep neural networks, composed of a long short-term memory (LSTM) unit for question comprehension and a convolutional neural network (CNN) for learning a single image representation. We argue that the single visual representation contains only limited and general information about the image contents and thus limits the model's reasoning capabilities. In this work we introduce a modular neural network model that learns a multimodal and multifaceted representation of the image and the question. The proposed model learns to use the multimodal representation to reason about the image entities and achieves a new state-of-the-art performance on both VQA benchmark datasets, VQA v1.0 and v2.0, by a wide margin.

classification module, module, representation, (16 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

ECG Arrhythmia Detection Using Disease-specific Attention-based Deep Learning Model

Jin, Linpeng

arXiv.org Artificial Intelligence | Jul-25-2024

The electrocardiogram (ECG) is one of the most commonly used tools to diagnose cardiovascular disease in clinical practice. Although deep learning models have achieved very impressive success in the field of automatic ECG analysis, they often lack the model interpretability that is important in healthcare applications. To this end, many schemes, such as general-purpose attention mechanisms, the Grad-CAM technique, and ECG knowledge graphs, have been proposed for integration with deep learning models. However, they either result in decreased classification performance or are inconsistent with how cardiologists reason when interpreting ECGs. In this study, we propose a novel disease-specific attention-based deep learning model (DANet) for arrhythmia detection from short ECG recordings. The novel idea is to introduce a soft-coding or hard-coding waveform-enhanced module into existing deep neural networks, which amends the original ECG signals under the guidance of the diagnostic rule for a given disease type before they are fed into the classification module. For the soft-coding DANet, we also develop a learning framework combining self-supervised pre-training with two-stage supervised training. To verify the effectiveness of our proposed DANet, we applied it to the problem of atrial premature contraction detection, and the experimental results show that it achieves superior performance compared to the benchmark model. Moreover, it also provides the waveform regions that deserve special attention in the model's decision-making process, allowing it to be a medical diagnostic assistant for physicians.

attention weight, ecg signal, module, (11 more...)

arXiv.org Artificial Intelligence

2407.18033

Country:

Asia > China > Zhejiang Province > Hangzhou (0.05)
Asia > China > Anhui Province > Hefei (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Genre: Research Report > New Finding (0.86)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Multi-module Robust Method for Transient Stability Assessment against False Label Injection Cyberattacks

Wang, Hanxuan, Lu, Na, Liu, Yinhong, Wang, Zhuqing, Wang, Zixuan

arXiv.org Artificial Intelligence | Jun-10-2024

The success of deep learning in transient stability assessment (TSA) heavily relies on high-quality training data. However, the label information in TSA datasets is vulnerable to contamination through false label injection (FLI) cyberattacks, resulting in degraded performance of deep TSA models. To address this challenge, a Multi-Module Robust TSA method (MMR) is proposed to rectify the supervised training process misguided by FLI in an unsupervised manner. In MMR, a supervised classification module and an unsupervised clustering module are alternately trained to improve the clustering friendliness of representation learning, thereby achieving accurate clustering assignments. Leveraging the clustering assignments, we construct a training label corrector to rectify the injected false labels and progressively enhance robustness and resilience against FLI. However, there is still a gap in accuracy and convergence speed between MMR and FLI-free deep TSA models. To narrow this gap, we further propose a human-in-the-loop training strategy, named MMR-HIL. In MMR-HIL, potential false samples can be detected by modeling the training loss with a Gaussian distribution. From these samples, the most likely false samples and most ambiguous samples are re-labeled by a TSA-expert-guided bi-directional annotator and then subjected to penalized optimization, aimed at improving accuracy and convergence speed. Extensive experiments indicate that MMR and MMR-HIL both exhibit powerful robustness against FLI in TSA performance. Moreover, the contaminated labels can also be effectively corrected, demonstrating superior resilience of the proposed methods.
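One simple way to picture "modeling the training loss with a Gaussian distribution" is to fit a mean and standard deviation to the per-sample losses and flag samples whose loss lies far in the upper tail. The sketch below does exactly that. The `mean + k * std` threshold rule, the function name `flag_suspect_samples`, and the default `k` are assumptions for illustration only; the abstract does not specify how MMR-HIL derives its decision rule.

```python
from statistics import mean, stdev


def flag_suspect_samples(losses, k=2.0):
    """Return indices of samples whose loss exceeds mean + k * std.

    Assumes per-sample training losses are roughly Gaussian for correctly
    labeled samples; the threshold rule is a hypothetical choice.
    """
    mu = mean(losses)
    sigma = stdev(losses)  # sample standard deviation
    threshold = mu + k * sigma
    return [i for i, loss in enumerate(losses) if loss > threshold]


# Nine low-loss samples and one outlier that may carry a false label.
losses = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08, 0.10, 0.11, 2.50]
print(flag_suspect_samples(losses))  # [9]
```

In a human-in-the-loop setting, the flagged indices would be the candidates passed on for expert re-labeling rather than being corrected automatically.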

assignment, false label, mmr-hil, (15 more...)

arXiv.org Artificial Intelligence

2406.06744

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > Singapore (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Energy > Power Industry (0.92)
Government > Military > Cyberwarfare (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
